npj Precision Oncology — Latest Matching Preprints

1

Catalog of gut microbiota alterations associated with anticancer therapies across multiple cancer types

Kurihara, K.; Sakai, S. A.; Sawada, K.; Iida, N.; Horasawa, S.; Fujisawa, T.; Nakamura, Y.; Kageyama, S.-I.; Bando, H.; Yoshino, T.; Tsuchihara, K.; Yamashita, R.

2026-05-09 microbiology 10.64898/2026.05.08.723707 medRxiv

Top 0.1%

18.1%

Background: Anticancer therapies can alter the gut microbiota and may affect gut bacteria associated with treatment response. However, most longitudinal studies have focused on specific cancer types or individual treatment regimens, and systematic analyses across diverse cancer therapies remain limited. We analyzed longitudinal fecal microbiota profiles using 16S ribosomal RNA gene amplicon sequencing in the pan-cancer SCRUM-Japan MONSTAR-SCREEN cohort. We included 528 paired pre- and post-treatment fecal samples from 264 patients with advanced solid tumors across 18 cancer types and characterized the gut microbiota alterations associated with 22 anticancer drugs and related clinical factors. Results: Across the cohort, Shannon diversity did not significantly change after treatment (mean, 3.81 vs. 3.78; P = 0.58), and pre- and post-treatment samples exhibited no clear separation in ordination space. However, within-patient analysis detected a subtle but significant longitudinal microbiota shift (paired PERMANOVA, P = 0.0001), highlighting the importance of accounting for paired sampling. Clustering of genus-level compositional alterations revealed patient groups with distinct degrees of microbiota alteration, with the largest shifts associated with antibiotic exposure, transition from normal stool to diarrhea, and specific treatment regimens. Multivariable regression analysis of 22 anticancer drugs identified drug-bacteria associations and demonstrated that drugs with similar mechanisms of action, including epidermal growth factor receptor (EGFR) inhibitors and immune checkpoint inhibitors, exhibited similar microbiota change profiles. Targeted analyses highlighted concordant reductions in the Christensenellaceae R-7 group among EGFR inhibitor-exposed patients and depletion of Faecalibacterium among immune checkpoint inhibitor-exposed patients.
Conclusions: This study provides a cross-cancer catalog of microbiota alterations associated with anticancer therapies and highlights therapy-related shifts in the gut ecosystem, including patterns shared by drugs with similar mechanisms of action.
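
For readers unfamiliar with the Shannon diversity index reported in this abstract, a minimal Python sketch follows. The read counts are hypothetical and are not data from the study, and the natural logarithm is an assumption, since the abstract does not state the base.

```python
import math

def shannon_diversity(counts):
    """Shannon index H = -sum(p_i * ln(p_i)) over taxa with nonzero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive value")
    h = 0.0
    for c in counts:
        if c > 0:
            p = c / total
            h -= p * math.log(p)
    return h

# Hypothetical genus-level read counts for two samples (illustrative only).
even_sample = [25, 25, 25, 25]    # four equally abundant genera
skewed_sample = [97, 1, 1, 1]     # one dominant genus

h_even = shannon_diversity(even_sample)
h_skewed = shannon_diversity(skewed_sample)
```

An evenly distributed community reaches the maximum value ln(number of taxa), while a community dominated by one genus scores much lower.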

2

Artificial intelligence-driven virtual tumorboard enhances precision care in myelodysplastic syndromes

Swoboda, D. M.; DeZern, A. E.; England, J. T.; Venugopal, S.; Kehoe, T.; Aubrey, B. J.; Raddi, M. G.; Consagra, A.; Wang, J.; Andreadakis, J.; Rivero, G.; Stahl, M.; Zeidan, A. M.; Haferlach, T.; Brunner, A. M.; Buckstein, R.; Santini, V.; Della Porta, M. G.; Sekeres, M. A.; Nazha, A.

2026-03-27 hematology 10.64898/2026.03.26.26349088 medRxiv

Top 0.1%

15.2%

Background: Large language models (LLMs) perform well on standardized medical exam questions, but their reliability for complex hematology decision making is uncertain. We compared four general-purpose LLMs (GPT-4o, GPT-o3, Claude Sonnet 4, and DeepSeek-V3) with a Virtual MDS Panel (VMP), a coordinated multi-agent AI system in which domain-specialized, rule-bound software agents (WHO/ICC guidelines; IPSS-R/IPSS-M; NCCN) collaborate to generate tumor-board-level recommendations. Methods: Each model generated diagnostic, prognostic, and treatment recommendations for 30 myelodysplastic syndrome cases. Nine international MDS experts from five institutions, blinded to model identity, completed 3,000 structured ratings using 5-point Likert scales for diagnosis, prognosis, and therapy and classified errors by severity. Results: General-purpose LLMs achieved modest expert ratings (overall mean scores: 3.7 for GPT-o3, 3.2 for GPT-4o, 3.1 for DeepSeek, and 3.0 for Claude) and contained major factual errors in at least 24% of responses. The VMP increased the proportion of outputs rated 4 or higher to 87% (vs. 34-66% for general-purpose models), improved mean scores to 4.3 overall (4.3 for diagnosis, 4.4 for prognosis, and 4.1 for therapy), and reduced major errors to 8%. Conclusions: In this blinded evaluation of 30 complex MDS cases, general-purpose LLMs produced clinically important errors at rates that raise safety concerns for autonomous hematology decision making. The VMP, a rule-bound, multi-agent architecture, approached expert-level accuracy, supporting its potential role as an effective decision-support tool for MDS in the future.

3

Novel polymeric fluoropyrimidine CF10 demonstrates superior therapeutic index and survival advantage in patient-derived models of 5-fluorouracil-refractory colorectal cancer

Sah, N.; Omy, T. R.; Kairamkonda, S.; Acharya, G.; Palle, H.; Luna, P.; Mani, C.; Gmeiner, W.; Cheedella, N.; Reedy, M.; Palle, K.

2026-04-08 cancer biology 10.64898/2026.04.05.716582 medRxiv

Top 0.1%

10.6%

Background: Fluoropyrimidines, specifically 5-fluorouracil (5-FU), remain the cornerstone of colorectal cancer (CRC) therapy. However, intrinsic and acquired resistance, alongside dose-limiting systemic toxicities, often result in treatment failure and disease relapse. There is a pressing clinical need for next-generation fluoropyrimidines that can retain antitumor activity in 5-FU-refractory CRC models while maintaining a favorable safety profile. Methods: We evaluated the antitumor efficacy of CF10, a novel polymeric fluoropyrimidine designed for the sustained delivery of FdUMP, against equimolar 5-FU. We utilized a diverse panel of six patient-derived CRC organoid (PDO) models to assess 3D growth inhibition under both normoxic (~20% O2) and physioxic (5% O2) conditions. Mechanisms of action were investigated via γH2AX signaling (DNA damage), Annexin V/PI flow cytometry (death kinetics), and ALDEFLUOR assays (stem-like populations). Functional suppression of metastasis-associated phenotypes was evaluated using 3D Matrigel invasion assays. Finally, the therapeutic index and overall survival were validated in vivo using two independent patient-cell-derived xenograft (PCDX) models (TX-CC-199 and TX-CC-201). Results: CF10 demonstrated significantly greater suppression of organoid growth compared to equimolar 5-FU across all patient-derived lines, regardless of morphological heterogeneity or oxygen tension. In 3D invasion assays, CF10 achieved superior anti-invasive activity even at a 10-fold lower molar dose than 5-FU. This functional advantage was mirrored by a marked depletion of the ALDH-high stem-like subpopulation, which was largely recalcitrant to 5-FU. Mechanistically, CF10 induced intensified replication stress, DNA damage and repair signaling (γH2AX, Top1cc/pRPA32, FANCD2), and pushed CRC cells toward irreversible, terminal PI-positive death states.
In vivo, CF10 treatment resulted in profound tumor growth inhibition and a robust survival advantage in two patient cell-derived xenograft (PCDX) models (Log-rank P<0.01) without inducing systemic weight loss or noticeable toxicity. Conclusions: By integrating 3D patient-derived modeling with in vivo validation, we demonstrate that CF10 effectively overcomes the biological and pharmacological limitations of 5-FU. CF10 targets the aggressive, invasive, and stem-like subpopulations of CRC that drive clinical relapses. These findings provide a compelling translational rationale for the clinical development of CF10 as a superior alternative to standard fluoropyrimidines in both treatment-naive and refractory CRC. Significance Statement: Despite the foundational role of 5-fluorouracil (5-FU) in colorectal cancer (CRC) therapy, resistance and systemic toxicity remain major barriers to curative outcomes. This study identifies CF10, a novel polymeric fluoropyrimidine, as a superior alternative that overcomes 5-FU resistance in biologically diverse patient-derived organoids and xenograft models. Crucially, CF10 demonstrates a unique capacity to suppress the invasive, aldehyde dehydrogenase (ALDH)-high stem-like subpopulations that likely survive standard chemotherapy (5-FU) by maintaining efficacy under physiological oxygen levels and providing a significant survival advantage in vivo with improved tolerability. CF10 represents a promising translational candidate for the treatment of both treatment-naive and refractory CRC.

4

Characterizing the Stability of Radiomics-Derived Tumor Habitats Using Image Perturbation in Head and Neck Cancer

Altinok, O.; Waqas, A.; Rasool, G.; Schabath, M. B.; Guvenis, A.

2026-06-02 radiology and imaging 10.64898/2026.05.30.26354532 medRxiv

Top 0.1%

10.5%

Tumor habitat imaging aims to capture intratumoral heterogeneity by grouping voxels with similar radiomic properties into spatially coherent subregions. However, radiomic features are known to be sensitive to small variations in image acquisition and processing, which can affect the stability of the resulting habitat maps. Feature repeatability is usually evaluated using test-retest scans, but such data are rarely available in clinical practice. To overcome this, we adopted an image perturbation framework, which simulates test-retest conditions by applying small, controlled changes to a single image. In head and neck cancer (HNC), where imaging is further complicated by complex anatomy, dental artifacts, and variability in tumor delineation, dedicated stability analyses are still missing. In this study, we evaluated how the repeatability of radiomic features affects habitat stability in 390 oropharyngeal cancer patients (discovery cohort). For each patient, 11 perturbed CT volumes were generated using small in-plane rotations, sub-voxel translations, and tumor-adaptive Gaussian noise. Ninety-three radiomic features were extracted from each image set, and their repeatability was assessed using the lower confidence limit of the intraclass correlation coefficient (ICC-LCL), grouped into poor, moderate, good, and excellent categories. Tumor habitats were then generated using K-means clustering (H = 3) for each feature subset, and habitat stability was measured by the Dice similarity coefficient (DSC) between habitat maps obtained from original and perturbed images. Overall, 48.4% of features were poorly repeatable and only 6.5% reached the excellent category, with first-order features being more stable than texture-based ones. 
Habitat stability followed a clear monotonic trend with feature repeatability: the median DSC was 0.93 for habitats generated from excellent features, 0.84 for good features, 0.75 for moderate features, and dropped to 0.41 for poorly repeatable features. Habitats generated using all features (without any repeatability-based filtering) yielded an intermediate median DSC of 0.52. All pairwise comparisons between feature subsets were statistically significant (p < 0.001). To evaluate the generalizability of these findings, the analysis was repeated in an independent external validation cohort of 372 oropharyngeal cancer patients treated at the H. Lee Moffitt Cancer Center. The stability classification showed substantial feature-level concordance between the discovery and validation cohorts (overall agreement 67.7%, quadratic-weighted Cohen's kappa = 0.78), with no feature shifting by more than two stability classes. The habitat-stability hierarchy was fully preserved in the validation cohort (median DSC of 0.87, 0.73, 0.69, and 0.39 for excellent, good, moderate, and poor features, respectively; all pairwise p < 0.001). These results show that selecting features with higher repeatability clearly improves the spatial consistency of habitat maps in HNC and support the use of perturbation-based stability analysis as a routine step in habitat imaging studies.
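
The habitat-stability metric described in this abstract can be illustrated with a minimal Python sketch of the Dice similarity coefficient. The voxel labels are hypothetical, and the sketch assumes habitat labels are already aligned between the original and perturbed maps (K-means labels are arbitrary, so a real pipeline would need a label-matching step that the abstract does not describe).

```python
def dice(mask_a, mask_b):
    """Dice similarity coefficient 2*|A and B| / (|A| + |B|) for two boolean sequences."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same length")
    size_a = sum(1 for x in mask_a if x)
    size_b = sum(1 for x in mask_b if x)
    overlap = sum(1 for x, y in zip(mask_a, mask_b) if x and y)
    if size_a + size_b == 0:
        return 1.0  # both masks empty: treated as perfect agreement
    return 2.0 * overlap / (size_a + size_b)

def mean_habitat_dice(labels_ref, labels_pert, n_habitats=3):
    """Average the per-habitat Dice over habitats 0..n_habitats-1 (labels assumed aligned)."""
    scores = []
    for h in range(n_habitats):
        ref_mask = [label == h for label in labels_ref]
        pert_mask = [label == h for label in labels_pert]
        scores.append(dice(ref_mask, pert_mask))
    return sum(scores) / len(scores)

# Hypothetical flattened habitat maps for six voxels (illustrative only).
original_map = [0, 0, 1, 1, 2, 2]
perturbed_map = [0, 0, 1, 2, 2, 2]
stability = mean_habitat_dice(original_map, perturbed_map)
```

Here one voxel switches from habitat 1 to habitat 2 after perturbation, which lowers the Dice for both affected habitats while habitat 0 stays at 1.0.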

5

TumorArchetypeR: A modular framework to derive signature-based tumor subtypes

Luetge, M.; Nassiri, S.

2026-05-14 cancer biology 10.64898/2026.05.11.724259 medRxiv

Top 0.1%

10.5%

Motivation: The tumor microenvironment (TME) dictates cancer progression and therapeutic response, yet translating TME subtypes into robust clinical biomarkers remains a significant challenge. Existing classification models typically rely on static gene signatures and cohort-dependent normalization, making them ill-suited for application to the small, unbalanced datasets common in early-phase clinical trials. To better guide drug development, methods are required that offer the flexibility to target specific biological contexts and bridge the gap between the discovery of tumor archetypes and their robust translation to individual patient samples. Results: We developed TumorArchetypeR, a modular R package that unifies unsupervised subtype discovery with the generation of rank-based, single-sample classifiers. By leveraging a systematic parameter grid search, the framework identifies stable, data-driven subtypes rather than relying on arbitrary defaults. Crucially, to ensure clinical translatability, the package includes a module to train a robust classifier using binary gene-pair rules, enabling prediction without cohort-level preprocessing. Applying TumorArchetypeR to colorectal cancer, we resolved the heterogeneity of fibrotic tumors, distinguishing an immunosuppressive "Immune-enriched/Fibrotic" state from an immune-excluded "Fibrotic/Myeloid" phenotype. Furthermore, we identified a distinct "Th/B-cell enriched" archetype associated with superior survival, a group largely obscured by existing pan-cancer models. With our rank-based classifier demonstrating robust performance on previously unseen samples, these findings highlight TumorArchetypeR as a scalable, end-to-end solution for refining patient stratification and optimizing precision oncology strategies. The TumorArchetypeR package and documentation are openly available on GitHub at https://github.com/lutgem/TumorArchetypeR.
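
The idea of a single-sample classifier built from binary gene-pair rules can be sketched generically in Python. This is not the TumorArchetypeR API (the package is written in R); the gene names, rules, subtype labels, and majority-vote scheme below are all hypothetical and chosen only to show why such rules need no cohort-level normalization.

```python
from collections import Counter

# Hypothetical rules: (gene_a, gene_b, label_if_a_higher, label_if_b_higher).
RULES = [
    ("GENE_A", "GENE_B", "subtype_1", "subtype_2"),
    ("GENE_C", "GENE_D", "subtype_1", "subtype_2"),
    ("GENE_E", "GENE_F", "subtype_1", "subtype_2"),
]

def classify_sample(expression, rules):
    """Majority vote over within-sample comparisons of gene pairs."""
    votes = Counter()
    for gene_a, gene_b, label_a, label_b in rules:
        if expression[gene_a] > expression[gene_b]:
            votes[label_a] += 1
        else:
            votes[label_b] += 1
    return votes.most_common(1)[0][0]

# Hypothetical expression values for one sample (illustrative only).
sample = {"GENE_A": 5.0, "GENE_B": 2.0, "GENE_C": 1.0,
          "GENE_D": 9.0, "GENE_E": 3.0, "GENE_F": 1.0}
prediction = classify_sample(sample, RULES)

# Rescaling every value by a positive factor leaves all pairwise orderings,
# and therefore the prediction, unchanged.
rescaled = {gene: value * 1000.0 for gene, value in sample.items()}
prediction_rescaled = classify_sample(rescaled, RULES)
```

Because each rule compares two genes within the same sample, the prediction depends only on relative ordering, which is the property that allows classification of one sample at a time.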

6

DNA methylation signatures of mismatch repair-deficient colorectal cancer

Ward, R.; Endicott, M.; Mallabar-Rimmer, B.; Burrage, J.; Sherwood, K.; Huang, Q.; Ward, J. C.; Thorn, S.; Woolley, C.; Wood, S.; Dempster, E.; Green, H. D.; Tomlinson, I.; Webster, A. P.

2026-04-13 cancer biology 10.64898/2026.04.09.717165 medRxiv

Top 0.1%

10.2%

Background: Colorectal cancer (CRC) is a molecularly heterogeneous disease shaped by both genetic and epigenetic alterations. Approximately 15% of CRCs display widespread CpG island hypermethylation, known as the CpG Island Methylator Phenotype (CIMP). CIMP-high (CIMP-H) tumours frequently exhibit MLH1 promoter hypermethylation, leading to mismatch repair deficiency (MMRd) and microsatellite instability (MSI). However, DNA methylation patterns associated with MSI, independent of CIMP and MLH1 silencing, and the influence of clinical variables such as anatomical location and patient age on the CRC methylome remain poorly characterised. Methods: We performed epigenome-wide DNA methylation profiling of 259 primary CRC tissue samples using the Illumina EPICv2 array, comparing differential methylation between MSI and microsatellite stable (MSS) CRC, adjusting for tumour purity, MLH1 promoter methylation, CIMP status, and anatomical location, to account for known confounders. We further evaluated the independent effects of anatomical location and patient age on global methylation patterns. Results: Epigenome-wide differential methylation between MSS and MSI CRC was dominated by MLH1 promoter hypermethylation. After adjusting for MLH1 hypermethylation and CIMP status, we identified a distinct set of 656 CpG sites associated with MMRd independent of MLH1 silencing. These included hypermethylation at LRP6, GSK3β, and CDK12, implicating altered WNT signalling and transcriptional regulation pathways. Comparison of MSI subgroups revealed the co-occurrence of MLH1 hypermethylation with promoter hypermethylation at TXNRD1. Anatomical location showed a strong independent effect on methylation patterns, while we observed only modest effects of patient age on the CRC methylome after adjustment for confounders.
Conclusions: We identified a distinct methylation profile distinguishing MSS and MSI CRC, including MLH1-independent markers of MMRd, as well as novel differentially methylated loci within MSI subgroups. We further showed that anatomical location has a strong independent impact on the CRC methylome. Together, these findings refine the molecular characterisation of CRC and highlight potential epigenetic markers that could inform patient stratification and precision oncology.

7

Quantifying Treatment Resistance in Mixtures of Gastrointestinal Stromal Tumor Cells with BARMIX

Darbalaei, M.; Muhlenberg, T.; Zummack, J.; Dujardin, P.; Grunewald, S.; Baginska, A.; Munteanu, P.; Martinez Cruz, M.; Dorsch, M.; Schramm, A.; Bauer, S.; Hoffmann, D.; Gruner, B. M.

2026-03-25 systems biology 10.64898/2026.03.23.713602 medRxiv

Top 0.1%

9.9%

Targeted therapies in gastrointestinal stromal tumors (GIST) often fail due to heterogeneous resistance mutations arising across metastatic sites. Efficient, rational design of mutation-specific therapies requires the ability to quantify treatment resistance across many genotypes in parallel. Here, we present BARcode MIXture analysis (BARMIX), a platform combining multiplexed experiments with DNA-barcoded cancer cell mixtures in vitro and in vivo, and a probabilistic framework for quantitative assessment of genotype-specific treatment resistance. BARMIX efficiently and accurately recapitulated known clinical resistance patterns in GIST and matched resistance measurements from individual cell lines in vitro and in vivo. This experimental-computational approach provides a scalable and broadly applicable strategy for quantifying treatment responses in complex cell populations, enabling systematic preclinical testing of new drugs and combinations to identify mutation-specific therapeutic options for precision oncology in GIST and beyond.

8

dbGIST: An LLM-Assisted Multi-Omics Resource for Target Exploration and Cross-Dataset Validation in Gastrointestinal Stromal Tumors

Sun, Z.; Zhao, Q.; Li, J.-H.; Li, J.-J.; Liu, H.; Guo, Y.-X.; Tang, Y.-D.; Yang, F.; Liu, X.; Peng, S.-F.; Mi, W.-n.; Zhang, G.; Zhang, Z.; Yuan, M.-L.; Li, G.-H.; Wang, Y.-F.; Liu, C.; Li, S.-L.; Yang, J.-H.; Fu, Y.

2026-05-26 cancer biology 10.64898/2026.05.22.727292 medRxiv

Top 0.1%

9.9%

Gastrointestinal stromal tumors (GISTs) are the most common mesenchymal neoplasms of the gastrointestinal tract, yet GIST-specific omics evidence remains scattered across small cohorts and is not represented as a dedicated disease project in major cancer genomics resources, limiting reproducible target exploration. Here, we present dbGIST (https://www.dbgist.com), a dedicated GIST-focused multi-omics resource built to make dispersed GIST evidence searchable, analyzable, and reusable. dbGIST harmonizes data from 37 centers and 1,991 samples, including pathologically verified in-house cohorts, across genomics, bulk transcriptomics, proteomics, phosphoproteomics, and single-cell transcriptomics, and couples these data with curated clinical annotations covering survival, mutation status, risk stratification, metastasis or recurrence, mitotic index, tumor site and size, and imatinib response. The platform supports cohort-level molecular-clinical association, survival, enrichment, immune-infiltration, drug-sensitivity, and single-cell analyses through interactive visualizations, downloadable source data, and public APIs for programmatic access to reusable analysis outputs and visualization-ready data. An optional LLM-assisted interface helps users navigate analyses and interpret outputs. Using MCM7 as a case study, dbGIST linked a resource-derived candidate to survival, risk features, metastatic or recurrent disease, imatinib-response phenotypes, proliferative cell states, and in vitro GIST-cell behavior. dbGIST therefore provides a traceable and interoperable resource for target exploration and precision oncology research in GIST.

9

Spatially resolved transcriptomic and proteomic profiling reveals cell interaction programs that predict Barrett's esophagus progression

Monarez, I. D.; Kim, E. N.; Moon, K.; Baker, A.-M.; Chen, P. Z.; Bressan, D.; Miremadi, A.; di Pietro, M.; Hannon, G. J.; Graham, T. A.; Fitzgerald, R. C.; Chang, Y. H.; Zhuang, L.

2026-05-12 systems biology 10.64898/2026.05.08.723546 medRxiv

Top 0.1%

9.2%

Barrett's esophagus (BE) is the precursor lesion of esophageal adenocarcinoma (EAC). It affects approximately 5% of adults in the United States and significantly increases the risk of developing EAC. However, current surveillance strategies cannot reliably distinguish patients who will progress from those who will remain stable. Direct studies of progressor BE are extremely limited due to the limited availability of tissue with known progression outcomes and have largely been restricted to genomic profiling approaches. The premalignant cellular landscape of progressor BE remains poorly understood. Here, we used complementary spatial transcriptomic and proteomic imaging to profile 34 non-dysplastic BE patients under endoscopic surveillance, including those who subsequently progressed to dysplasia or EAC, termed "Progressors", and those who remained stable, termed "Non-progressors". Transcriptomics-based Xenium analysis captured 974,604 cells across 70 whole-biopsy regions, while protein-based imaging mass cytometry profiled 372,242 cells across 119 selected regions. FUME-TCRseq further quantified T cell clonotypes from matched tissue scrolls. Cellular composition was generally similar between Progressors and Non-progressors. However, Progressors showed increased intestinal Barrett's columnar cells, B cells and gastric progenitor-like cells, together with enhanced immune-epithelial interactions, whereas Non-progressors retained coordinated stromal organization. Spatial interaction features strongly outperformed cell composition and density for progression prediction. A combined spatial interaction model achieved an area under the curve (AUC) of 0.97, compared with 0.62 and 0.68 for composition and density alone. Complementary imaging mass cytometry further resolved the underlying immune programs, identifying cytotoxic and antigen-presenting myeloid features enriched in Progressors, and CD56-associated memory T cell interactions enriched in Non-progressors.
Together, these findings support a model in which BE progression is driven by progressive remodeling of epithelial-immune-stromal architecture rather than emergence of distinct dysplasia-like cell subsets. Increased T cell clonal diversity and recruitment of cytotoxic and antigen-presenting immune niches may also reflect an evolving response to genomic alteration prior to dysplasia. These results establish that spatial tissue architecture, rather than specific cell types, captures progression-associated microenvironmental states in BE and provide a framework for spatially informed patient stratification and early cancer risk assessment.

10

Mechanistic learning to predict and understand minimal residual disease

Marzban, S.; Robertson-Tessi, M.; West, J.

2026-04-21 cancer biology 10.64898/2026.04.16.718968 medRxiv

Top 0.1%

8.5%

Mechanistic modeling has long been used as a tool to describe the dynamics of biological systems, especially cancer in response to treatment. The key advantage of mechanistic models lies in the interpretability of relationships between input parameters and outcomes of interest. Mechanistic models may also be calibrated to a cohort of patients and scaled up to generate a simulated set of virtual patients whose aggregate behavior reproduces key characteristics of the real patient population. In contrast, machine learning techniques offer strong prediction performance, especially for high-dimensional datasets that are common in oncology. Here, we employ a Mechanistic Learning framework that combines the advantages of both approaches by training machine learning models on mechanistic parameters inferred from clinical patient data. We assess the utility of virtual clinical cohorts for 1) scaling up small cohort sizes and 2) balancing unbalanced patient subgroups in the setting of BCR::ABL1-positive lymphoblastic leukemia. Our mechanistic model (a Markov chain model) contains sixteen parameters that describe the rate of cell fate transitions that occur in patients with B-cell precursor acute lymphoblastic leukemia. The machine learning model (a ridge logistic regression model) is trained on these parameters to predict two clinically relevant features: BCR::ABL1 fusion gene status (positive or negative) and minimal residual disease status (positive or negative) post-induction chemotherapy. Model training is done in an iterative fashion to assess which (and how many) parameters are critical to maintain high predictive performance. Using machine learning models trained on the clinical flow-cytometry data, we find that the stem-like cell state alone is the most predictive feature for both BCR::ABL1-positive and MRD-positive disease, with composite scores (defined as the average of accuracy, balanced accuracy, and area under the curve) of 0.80 and 0.67, respectively.
By comparison, mechanistic learning achieves comparable or improved composite scores for BCR::ABL1-positive and MRD-positive disease, with scores of 0.81 and 0.71, respectively, using only de-differentiation for BCR::ABL1 and stem-state persistence together with differentiation-directed exit for MRD. Virtual Patient (VP) expansion is informative for robustness analysis and class balancing, but full cohort expansion introduced additional heterogeneity, reduced predictive performance, and required larger models, whereas VP-based balancing yielded only a modest gain over class weighting at substantially greater computational cost. In summary, a mechanistic-learning approach not only preserves predictive performance, but also provides a biological hypothesis for why stemness is predictive of these clinically relevant outcomes.
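
The composite score defined in this abstract (the average of accuracy, balanced accuracy, and area under the curve) can be sketched in plain Python. The labels, scores, and the 0.5 decision threshold are hypothetical; the abstract does not say how predictions were thresholded or how AUC was computed, so the pairwise-ranking AUC below is an assumption.

```python
def composite_score(y_true, y_score, threshold=0.5):
    """Average of accuracy, balanced accuracy, and ROC AUC for binary labels (0/1)."""
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    n_pos = sum(1 for t in y_true if t == 1)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")

    accuracy = (tp + tn) / len(y_true)
    # Balanced accuracy: mean of sensitivity and specificity.
    balanced_accuracy = 0.5 * (tp / n_pos + tn / n_neg)

    # AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5.
    pos_scores = [s for t, s in zip(y_true, y_score) if t == 1]
    neg_scores = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = 0.0
    for p in pos_scores:
        for q in neg_scores:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    auc = wins / (n_pos * n_neg)

    return (accuracy + balanced_accuracy + auc) / 3.0

# Hypothetical labels and predicted probabilities (illustrative only).
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
score = composite_score(labels, scores)
```

In this toy case accuracy and balanced accuracy are both 0.5 and the AUC is 0.75, so averaging the three rewards good ranking even when the chosen threshold misclassifies half the samples.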

11

Genotype and methylation interact to reconfigure transcriptional regulation in colorectal cancer

Kim, B.; Kim, H.; Kwon, M.-K.; Hannenhalli, S.; Choi, S. S.

2026-05-30 bioinformatics 10.64898/2026.05.27.728350 medRxiv

Top 0.1%

8.5%

Background: Transcriptional regulation is shaped by both genomic variants and the environment. Yet, how the regulatory effects of genomic variants are reconfigured by dynamic epigenomic changes during tumorigenesis remains incompletely understood. Methods: We investigated methylation context-dependent links between genotype and gene expression in colorectal cancer (CRC) using paired tumor and normal-adjacent tissue (NAT) from 80 patients, thereby controlling for germline genomic background. By integrating promoter-targeted bisulfite sequencing with RNA-seq, we systematically compared expression quantitative trait loci (eQTLs) and methylation quantitative trait loci (mQTLs). To capture regulatory complexity beyond simple mediation, we implemented a memo-eQTL framework that explicitly models genotype x DNA methylation (GxM) interactions. Results: We observed extensive tissue specificity in both eQTL and mQTL landscapes; tumor-specific eGenes were significantly enriched for hallmark oncogenic pathways, including WNT and MAPK signaling. Standard mediation models explained only a minority of genotype-expression relationships, whereas our explicit interaction framework revealed widespread reconfiguration of methylation-dependent genetic effects in tumors. Memo-eQTL mapping (FDR < 0.05) identified 18 NAT and 73 tumor eGenes with significant GxM interactions, and results were consistent at a more permissive threshold (FDR < 0.2). We further developed a patient-level memo-eQTL score and found that interaction-based regulatory disruption in NAT, but not in tumor, significantly correlated with clinical stage (P = 0.035). Conclusions: Genetic regulation in cancer is reorganized through context-dependent GxM interactions. Importantly, GxM signatures in NAT are specifically linked to disease progression, offering new insights into field cancerization and the clinical consequences of regulatory reprogramming in CRC.

12

Addressing the Global Diagnostics Gap for Childhood Leukemias: A Global, Multisite Type 2 Hybrid Validation Study of Nanopore-based Adaptive Sampling Whole Genome Sequencing

Alexander, T. B.; Islam, R.; Aijaz, J.; Achterberg, T.; Bolous, N.; Cammel, K.; de Ridder, J.; Geyer, J.; Gray, S.; Groenewegen, N.; Hussain, S.; Imran, S.; Jamal, S.; Kar, S.; Kanavy, D.; Mansoor, N.; Parihar, M.; Saha, V.; Tops, B.; van Tuil, M.; Wilkins, D.; Weck, K.; Wu, G.; Zhou, L.; Kester, L.; Wang, J. R.; Bhakta, N.

2026-05-21 hematology 10.64898/2026.05.19.26353434 medRxiv

Top 0.1%

8.4%

Background: Modern therapy for childhood and adolescent leukemia requires accurate risk classification of genomic subtype. Although short-read next-generation sequencing (NGS)-based approaches provide comprehensive clinical diagnostics in limited, highly resourced settings, they remain expensive, slow, and inaccessible to most children worldwide. Transformative approaches are needed to improve diagnostic classification for leukemia globally. Methods: We simultaneously continued to develop an analytical pipeline, NASVar (Nanopore variant calling for adaptive sampling), and conducted a multicenter, type-two hybrid clinical validation study of an Oxford Nanopore Technologies (ONT) adaptive-sampling whole-genome sequencing (asWGS) assay across hospitals with varying diagnostic resources. In preparation for implementation, a global panel developed a leukemia-based standardized gene set and consensus laboratory-developed test (LDT) validation guidelines. Measures of assay effectiveness compared to both conventional and orthogonal NGS methods, where available, were simultaneously collected with data to measure the implementation outcomes of feasibility, fidelity, appropriateness, and cost. Results: All four centers successfully completed the LDT validation, with minimal adaptations required for regulatory compliance. A total of 457 specimens were sequenced (331 B-ALL, 83 AML, 43 T-ALL). For the 210 B-ALL cases with locally resolved genomic subtypes defined by DNA alterations, asWGS was 100% concordant (210/210). Cases locally defined as B-other were resolved via asWGS with disease-defining DNA alterations in 47% (49/105) of cases. An additional 41% (43/105) of locally defined B-other cases were classified by incorporation of DNA methylation, and all 16 B-ALL patient-derived xenograft controls were correct, for a total of 96% (318/331) of all B-ALL cases in the cohort resolved with single-assay asWGS.
For AML, 97% (56/58) of cases with locally resolved genomic subtypes were identified by automated asWGS analysis, while an additional two cases were identified after targeted manual review. At Indus Hospital in Pakistan, the B-ALL and AML diagnostic genomic subtype yield increased from 28% with local standard of care diagnostic testing, to 84% with asWGS. The cost of reagents and consumables in the United States, assuming pooled three-plexing, was $343/sample. Based on the combined hybrid validation results, all centers are independently preparing for clinical return of results. Conclusions: ONT asWGS was successfully validated as a clinical assay in four diverse hospital settings. As a single, multi-omic platform that delivers value across the continuum of high-resource to resource-limited contexts, the approach offers a disruptive solution to address the global equity gap in cancer diagnostics.
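
The B-ALL and AML percentages reported in this abstract can be checked with a short Python sketch. The counts are copied from the abstract; reading the 318 resolved B-ALL cases as the sum of the four listed groups is an inference from the numbers, not something the abstract states explicitly.

```python
# Counts as reported in the abstract.
b_all_total = 331
concordant_locally_resolved = 210   # locally resolved subtypes, all concordant
b_other_total = 105                 # cases locally defined as B-other
b_other_by_dna = 49                 # resolved by disease-defining DNA alterations
b_other_by_methylation = 43         # additionally classified using DNA methylation
pdx_controls_correct = 16           # patient-derived xenograft controls
aml_identified, aml_locally_resolved = 56, 58

def percent(numerator, denominator):
    """Percentage rounded to the nearest whole number."""
    return round(100 * numerator / denominator)

# Assumed decomposition of the 318 resolved B-ALL cases.
b_all_resolved = (concordant_locally_resolved + b_other_by_dna
                  + b_other_by_methylation + pdx_controls_correct)

summary = {
    "b_other_dna_pct": percent(b_other_by_dna, b_other_total),
    "b_other_methylation_pct": percent(b_other_by_methylation, b_other_total),
    "b_all_resolved_pct": percent(b_all_resolved, b_all_total),
    "aml_pct": percent(aml_identified, aml_locally_resolved),
}
```

Under that reading the three B-ALL groups (210 locally resolved, 105 B-other, 16 controls) also add up to the 331 B-ALL specimens sequenced.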

13

Pan-cancer survival modeling reveals structural limits of genomic feature integration in immunotherapy outcomes

Hassan, W.; Adeleke, S.

2026-04-18 bioinformatics 10.64898/2026.04.15.718634 medRxiv

Top 0.1%

8.3%

Background: Immune checkpoint inhibitors (ICIs) have improved outcomes across multiple cancer types, yet reliable predictors of survival remain limited. While genomic features such as tumor mutational burden (TMB) are widely used, their contribution to predictive modeling in heterogeneous real-world cohorts remains unclear. We evaluated the relative contributions of clinical and whole-genome sequencing (WGS) features in pan-cancer survival modeling. Methods: We analyzed 658 patients treated with ICIs with matched WGS data from Genomics England. Using a leakage-controlled machine learning framework with strict train-test separation, we compared four models: TMB-only, clinical-only, clinical+TMB, and an integrated 11-feature clinico-genomic XGBoost survival model. Model performance was assessed using Harrell's concordance index (C-index) with bootstrap confidence intervals. Results: TMB alone demonstrated near-random discrimination (C-index 0.50; 95% CI 0.44-0.56). Clinical variables substantially improved predictive performance (0.59; 95% CI 0.53-0.64), with marginal gain from adding TMB (0.59). The integrated model achieved a C-index of 0.60 (95% CI 0.55-0.65). While improvement over TMB alone was significant, incremental gain beyond optimized clinical models was modest. Feature attribution analysis showed that model performance was dominated by clinical variables, with genomic features contributing limited additional signal. Conclusions: These findings suggest that, in heterogeneous pan-cancer cohorts, predictive performance is constrained by the underlying data structure, in which dominant clinical signals overshadow genome-scale features. This study highlights fundamental limitations in integrating genomic data into survival models across diverse cancer types and provides a benchmark for future computational approaches.
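For readers unfamiliar with the evaluation metric named in this abstract, the following is a minimal sketch of Harrell's concordance index with a percentile bootstrap confidence interval. It is not the authors' code: the function names, the toy data, the number of resamples, and the choice of a percentile interval are all assumptions made for illustration.

```python
import random


def harrell_c_index(times, events, risks):
    """Share of comparable pairs in which the higher-risk patient fails first.

    A pair (i, j) is comparable when patient i has an observed event and a
    strictly shorter follow-up time than patient j. Tied risks count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored patients cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / comparable if comparable else float("nan")


def bootstrap_ci(times, events, risks, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the C-index (resampling patients)."""
    rng = random.Random(seed)
    n = len(times)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        c = harrell_c_index(
            [times[k] for k in idx],
            [events[k] for k in idx],
            [risks[k] for k in idx],
        )
        if c == c:  # drop resamples with no comparable pairs (NaN)
            stats.append(c)
    stats.sort()
    lower = stats[int((alpha / 2) * len(stats))]
    upper = stats[min(len(stats) - 1, int((1 - alpha / 2) * len(stats)))]
    return lower, upper


# Hypothetical toy cohort: follow-up (months), event flag, model risk score.
toy_times = [5, 8, 12, 3, 9, 15]
toy_events = [1, 1, 0, 1, 0, 1]
toy_risks = [0.9, 0.6, 0.3, 0.95, 0.5, 0.1]

if __name__ == "__main__":
    print(harrell_c_index(toy_times, toy_events, toy_risks))
    print(bootstrap_ci(toy_times, toy_events, toy_risks))
```

A C-index of 0.5 corresponds to random ranking, which is why the TMB-only result reported above is described as near-random discrimination.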

14

A network-based deep learning model integrating subclonal architecture for therapy response prediction in cancer

Kim, S.; Ha, D.; Nam, A.-r.; Cheong, S.; Lee, J.; Kim, S.; Park, S.

2026-03-17 cancer biology 10.64898/2026.03.14.711567 medRxiv

Top 0.1%

8.2%

Predicting treatment response remains challenging in oncology, particularly given the growing diversity of therapeutic options. Despite efforts using gene expression signatures or integrative multi-omics frameworks, robust and interpretable biomarkers remain limited. We present SubNetDL, a deep learning framework that integrates subclonal mutation profiles and protein-protein interaction networks via network propagation. Unlike condition-specific approaches, SubNetDL leverages somatic mutations alone and is applicable across diverse cancer types and treatment modalities. Applied to ten TCGA cancer-drug combinations, SubNetDL achieved consistently strong performance (median AUROC = 0.74) and successfully generalized to two independent immunotherapy datasets (median AUROC = 0.77). Importantly, it identified candidate biomarker genes with treatment-specific relevance. SubNetDL prioritized genes that were not central in the network, highlighting its ability to capture context-specific patterns beyond traditional metrics. In conclusion, our approach offers a robust and interpretable framework for identifying predictive biomarkers and stratifying patients based on mutation profiles and network context. Motivation: Intratumoral heterogeneity is a fundamental driver of therapeutic resistance, yet most predictive models rely on aggregate mutational burdens or static gene expression signatures, overlooking the subclonal dynamics that shape treatment outcomes. While network biology offers a functional lens to interpret genomic alterations, a framework that explicitly bridges subclonal architecture with system-level molecular interactions has been lacking. 
To address this, we developed SubNetDL, a deep learning framework that integrates patient-specific subclonal profiles with protein-protein interaction networks. By leveraging only somatic mutation data, SubNetDL captures the functional convergence of subclonal evolution, providing a robust and interpretable platform for patient stratification and biomarker discovery across diverse oncological contexts.
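The abstract does not specify which propagation scheme SubNetDL uses. As a generic illustration of network propagation, here is a minimal sketch of a random walk with restart on a small undirected graph. The gene labels, the toy network, the seed weights, and the restart probability are all hypothetical; in practice the seed weights might reflect per-patient subclonal mutation profiles, but that mapping is an assumption here.

```python
def propagate(adjacency, seed_scores, restart=0.5, n_iter=100, tol=1e-9):
    """Random walk with restart on an undirected graph.

    adjacency: dict mapping each node to a list of its neighbours (symmetric,
               every node has at least one neighbour).
    seed_scores: dict of initial scores for mutated genes (others default to 0).
    Returns the approximately stationary score for every node.
    """
    nodes = sorted(adjacency)
    degree = {n: len(adjacency[n]) for n in nodes}
    p0 = {n: seed_scores.get(n, 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(n_iter):
        nxt = {}
        for n in nodes:
            # Each neighbour m passes an equal share of its score along each edge.
            spread = sum(p[m] / degree[m] for m in adjacency[n])
            nxt[n] = (1 - restart) * spread + restart * p0[n]
        delta = max(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical four-gene path network: A - B - C - D, with a mutation in A.
toy_network = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
toy_scores = propagate(toy_network, {"A": 1.0})

if __name__ == "__main__":
    for gene, score in sorted(toy_scores.items()):
        print(gene, round(score, 4))
```

Propagated scores decay with network distance from the mutated gene, so unmutated neighbours receive a non-zero signal that a downstream model can use.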

15

Consensus Through Diversity: A Comprehensive Benchmark of Multi-Omic Approaches for Precision Breast Oncology

Sionakidis, A.; Pinilla Alba, K.; Abraham, J.; Simidjievski, N.

2026-04-21 bioinformatics 10.64898/2026.04.17.719159 medRxiv

Top 0.1%

7.2%

Emerging multi-omic profiling has made it feasible to subtype disease using multiple molecular layers. However, inconsistent preprocessing, heterogeneous implementations, variable evaluation, and limited reproducibility often constrain method selection. Here, we systematically benchmark 22 publicly available unsupervised approaches for bulk data on the TCGA-BRCA cohort across five modalities (RNA-seq, miRNA, DNA methylation, copy numbers, single nucleotide polymorphisms) and validate findings in two independent datasets, enabling a multi-layered comparison of performance, heterogeneous data support and interpretability. Most approaches fuse multi-omic data to produce a two-cluster solution largely aligned with ER status, with higher-resolution approaches further refining these into four coherent subclasses (angiogenic luminal, oxidative-phosphorylation/HER2-low luminal, immune-inflamed basal-like, and hyper-proliferative basal-like). Our benchmarking results indicate that methods based on similarity networks can efficiently produce stable, reliable partitions. Matrix factorisation and Bayesian factorisation algorithms produce rich latent representations, allowing quantification of feature and modality contributions, albeit at higher computational cost. Consensus clustering can be used on a case-by-case basis to refine partitions into more robust and generalisable findings. We aggregate our insights into a decision workflow that aligns with study goals, data characteristics, and computational resources, enabling optimal analytic strategies. This comprehensive assessment provides a practical roadmap for investigators seeking to extract reproducible, biologically meaningful subtypes from complex multi-omic datasets. We highlight the different technical and practical benefits and trade-offs that shape the selection and development of multi-omic approaches applied in precision oncology.

16

Distinct Spatial Programs of Response versus Resistance in Non-Small Cell Lung Cancer after Neoadjuvant Chemoimmunotherapy

Park, S. H.; Koh, J.; Bae, S.; Choi, H.; Yun, T.; Park, J. H.; Na, B.; Park, S.; Lee, H. J.; Park, I. K.; Kang, C. H.; Kim, Y. T.; Na, K. J.

2026-04-07 cancer biology 10.64898/2026.04.05.716543 medRxiv

Top 0.1%

7.2%

Background: Neoadjuvant chemoimmunotherapy (nCIT) has become a standard treatment for locally advanced resectable non-small cell lung cancer (NSCLC), yet the spatial biology underlying treatment resistance remains poorly understood. We used spatial transcriptomics to define the microenvironmental architecture of residual cancers in patients who did not achieve major pathologic response (non-MPR) compared with those who did (MPR). Methods: Spatial transcriptomics was performed on 10 formalin-fixed paraffin-embedded (FFPE) tumor blocks (5 MPR, 5 non-MPR) obtained from 8 patients treated with nCIT. A deep learning algorithm was applied to distinguish viable residual cancer spots from treatment-induced fibrosis and necrosis. Spatial deconvolution, distance modeling, ligand-receptor analysis, and functional pathway scoring were integrated to characterize niche-specific programs. Results: The MPR cancer core displayed an immune-permissive remodeling environment with deep infiltration of cytotoxic CD8+ T cells, mature dendritic cells (LAMP3+, CCR7+), and active efferocytosis signaling (APOE-TREM2), alongside robust MHC class II expression. The non-MPR cancer core, by contrast, exhibited spatial immune exclusion: a dense fibroblast barrier reinforced by TIMP1-CD63 signaling and Treg-enriched boundaries physically restricted effector T cell access to the cancer core. Residual cancer cells in non-MPR samples maintained active cell cycling and independently upregulated cytochrome P450-mediated drug detoxification and DNA damage response pathways without inducing MHC class II expression, effectively decoupling intrinsic survival from immune recognition. The non-MPR core also showed a hyper-metabolic profile, including elevated glutathione metabolism consistent with antioxidant buffering against chemotherapy-induced oxidative stress. 
TROP2 was broadly expressed across the non-MPR cancer core and co-localized with DNA damage response and nuclear factor erythroid 2-related factor 2 resistance signatures. Conclusions: Residual cancer cores in non-MPR tumors appear to represent evolved resistant niches sustained by structural immune exclusion, metabolic rewiring, and DNA repair proficiency. These findings highlight the spatial co-localization of epithelial anchors, such as TROP2, with intrinsic resistance pathways, providing a structural rationale for developing novel precision therapeutic strategies to bypass stromal barriers and overcome the cancer core's intrinsic repair capacity.

17

Variant-Level Functional Classification of Monoallelic TP53 Mutations Refines Prognostic Stratification in Myelodysplastic Neoplasms Beyond Allelic Status

Streuer, A.; Ochi, Y.; Riabov, V.; Nannya, Y.; Steiner, L.; Abba, M.; Metzgeroth, G.; Altrock, E.; Rapp, F.; Nowak, V.; Hepgueluem, E.; Nowak, D.; Hofmann, W.-K.; Ogawa, S.; Schmitt, N.

2026-03-20 hematology 10.64898/2026.03.18.26348425 medRxiv

Top 0.1%

6.7%

TP53 mutations represent one of the strongest adverse prognostic factors in myelodysplastic neoplasms (MDS). While multi-hit TP53 (TP53multiHit) alterations uniformly lead to very poor outcomes, the prognostic relevance of monoallelic TP53 (TP53mono) mutations remains controversial. TP53 variants can cause loss-of-function, dominant-negative, or gain-of-function effects. We hypothesized that functional heterogeneity among TP53 variants contributes to the variable clinical behavior observed in monoallelic TP53-mutated MDS. Therefore, we analyzed pretreatment samples from 4,505 patients with MDS from two independent cohorts (IWG, n=3,173; J-MDS, n=1,332), including 271 patients with TP53mono and 499 with TP53multiHit. Functional annotation of TP53 variants was performed using a previously published phenotype score (PS) derived from saturation mutagenesis screens, capturing dominant-negative and loss-of-function effects. Median overall survival (OS) differed significantly by TP53 allelic state (TP53 wild-type (TP53wt) 42.4 months; TP53mono 22.9 months; TP53multiHit 9.2 months; p < 0.001). Within the TP53mono subgroup, functional annotation identified marked heterogeneity. Patients with high PS (≥7) showed significantly inferior OS compared with those with low PS (median OS: 13.8 vs. 39.2 months; HR 1.68, 95% CI 1.16-2.42; p = 0.006), particularly for IPSS-R and IPSS-M low-risk cases. Combining PS and variant allele frequency (VAF) further improved risk stratification. TP53mono patients with PS ≥7 and VAF ≥22% had outcomes comparable to TP53multiHit (median OS: 8.8 months, p = 0.2), whereas those with PS <7 and VAF <22% exhibited survival similar to TP53wt (median OS: 49.7 months, p = 0.9). Overall, functional annotation of TP53 variants refines prognostication in TP53mono-mutated MDS and may enhance individualized risk assessment.
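The combined PS and VAF rule described in this abstract can be written as a small decision function. This is a sketch for illustration only, not a clinical tool and not the authors' implementation: the function name and group labels are hypothetical, and the abstract does not say how discordant cases (one criterion high, one low) were handled, so they are returned as a separate group here.

```python
def stratify_tp53_mono(phenotype_score, vaf_percent):
    """Group a monoallelic TP53 case using the thresholds quoted in the abstract.

    phenotype_score: functional phenotype score (PS) of the variant.
    vaf_percent: variant allele frequency in percent (e.g. 22 for 22%).
    """
    high_ps = phenotype_score >= 7
    high_vaf = vaf_percent >= 22
    if high_ps and high_vaf:
        return "multi-hit-like"  # outcomes reported as comparable to TP53multiHit
    if not high_ps and not high_vaf:
        return "wild-type-like"  # survival reported as similar to TP53wt
    return "discordant"  # assumption: handling not described in the abstract


if __name__ == "__main__":
    print(stratify_tp53_mono(8, 35))
    print(stratify_tp53_mono(3, 10))
    print(stratify_tp53_mono(8, 10))
```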

18

Basal gland localization and focal distribution of OLFM4-expressing cells in increasing severity of gastric intestinal metaplasia

Sathe, A.; Meka, R.; Geier, B.; Long, R.; Wong, C.; Han, S.; Shen, J.; Amieva, M. R.; Ji, H. P.; Huang, R. J.

2026-05-20 cancer biology 10.64898/2026.05.14.725297 medRxiv

Top 0.1%

6.5%

Patients with gastric intestinal metaplasia (GIM), a precancerous lesion, are at high risk for progressing to gastric cancer. Identifying these patients is critical to enable gastric cancer interception. Current approaches rely primarily on histologic evaluation of GIM severity and extent, which may be improved by incorporating molecular features that distinguish high-risk lesions. Our prior single-cell and spatial transcriptomics study identified differentially expressed genes associated with the highest-risk category of GIM. They included ANPEP expressed in enterocytes and CPS1 and OLFM4 expressed in intestinal stem-like or progenitor cells. We evaluated the protein expression and localization of these three markers to understand the cellular features associated with GIM risk and their spatial distribution within metaplastic tissues. Using multiplex immunofluorescence, whole slide image analysis and confocal microscopy, we examined protein expression from 100 tissue biopsies annotated for metaplasia severity using the Operative Link on Gastric Intestinal Metaplasia Assessment (OLGIM) system. Tissue samples included control gastric tissue, GIM, dysplasia and adenocarcinoma. Quantitative whole slide image analysis demonstrated that CPS1 expression had a modest association with disease severity. Although ANPEP was strongly associated with GIM severity, it was also frequently expressed in stromal regions outside epithelial glands. In contrast, OLFM4 expression was largely restricted to epithelial glands and showed a strong association with increased OLGIM severity. These OLFM4-positive epithelial cells were present in discrete glandular foci that expanded with increasing severity of metaplasia. Within individual metaplastic glands, OLFM4 expression was highest at the gland base with decreased expression toward the gland surface. Overall, these findings identified OLFM4 as a protein marker associated with high-risk GIM. 
The spatial organization of OLFM4-expressing cells at the base of metaplastic glands and their focal expansion within tissues suggest the presence of a stem cell-like epithelial compartment that may contribute to the progression of GIM towards gastric cancer.

19

Non-Genetic Mechanisms of Fractional Resistance to Abemaciclib in Dedifferentiated Liposarcoma.

Bailey, L. E.; Wolff, S. C.; Zikry, T.; Sessions, G. A.; Whitman, A. A.; Titerina, E. K.; Raish, H.; Beane, J.; Purvis, J. E.; Spanheimer, P. M.

2026-05-26 cancer biology 10.64898/2026.05.22.727236 medRxiv

Top 0.1%

6.5%

Dedifferentiated liposarcoma is a rare mesenchymal malignancy driven by amplification of chromosome 12q13-15, which includes the oncogenes CDK4 and MDM2. CDK4 amplification provides a rationale for targeted therapy with CDK4/6 inhibitors, and abemaciclib has shown the most durable activity reported to date in this disease. Clinical responses, however, are incomplete and often transient, and the cellular features that allow tumor cells to continue proliferating during treatment are not well understood. To address this gap, we performed multiplexed single-cell imaging to quantify 17 cell-cycle regulators in both the dedifferentiated liposarcoma cell line Lipo246 and surgically resected primary human cells exposed to abemaciclib. Both models contained a subpopulation of cells that retained phosphorylated retinoblastoma protein, a marker of cell proliferation, at the highest abemaciclib doses. These fractionally resistant cells were defined by selective enrichment of cyclin-dependent kinase 2 (CDK2), cyclin B1, and phosphorylated ribosomal protein S6 (pS6), and showed enhanced sensitivity to the CDK2 inhibitor tagtociclib. Together, these findings reveal nongenetic cell cycle plasticity as a mechanism of escape from CDK4/6 inhibition in dedifferentiated liposarcoma and nominate CDK2 and the PI3K-mTOR pathway as candidate targets for combination therapy.

20

Determination of the practical utility of ESMO Scale for Clinical Actionability of molecular Targets (ESCAT): mapping OncoKB level 1 alterations using ESCAT

Kordes, M.; Chakravarty, D.; Boberg, E.; Creignou, M.; de Petris, L.; Karlsson, C.; Burstrom, L. L.; Suehnholz, S.; Yachnin, J.; Wiklander, O. P.; Haglund de Flon, F.

2026-05-20 oncology 10.64898/2026.05.16.26353390 medRxiv

Top 0.1%

6.4%

Background. The European Society for Medical Oncology (ESMO) Scale for Clinical Actionability of molecular Targets (ESCAT) ranks genomic alterations by the evidence supporting the predictive value of the molecular target for response to targeted therapies. No openly available, systematically curated set of standard-of-care biomarkers mapped to the ESCAT framework exists to support clinical decision-making or harmonize biomarker interpretation. Methods. We mapped all OncoKB™ Level 1 biomarkers to ESCAT tiers using evidence cited by OncoKB™, excluding abstract-only data. Eight board-certified oncologists and hematologists independently assigned ESCAT tiers, with discrepancies resolved through structured consensus meetings. Recurring evidence scenarios that did not correspond to any existing ESCAT tier informed a set of a priori defined modifications, which were subsequently applied to biomarkers that could not be classified using native ESCAT criteria. Results. Of 188 OncoKB™ Level 1 biomarkers, 16 were excluded due to abstract-only evidence. Using native ESCAT criteria, 51% of the remaining biomarkers were classified as Tier 1, 3% Tier 2, 18% Tier 3, 6% Tier X and 22% could not be assigned to any tier. Applying the modified ESCAT criteria resolved all previously unclassifiable biomarkers and increased Tier 1 assignments to 73%. Inter-rater reliability (Krippendorff's alpha) was moderate (0.586) and 62% of classifications required consensus discussions. Comparison with ESCAT tiers reported in ESMO Clinical Practice Guidelines showed improved concordance when using the modified criteria. Conclusions. The native ESCAT criteria are highly stringent, resulting in many FDA-recognized, clinically validated biomarkers that are currently assigned Level 1 by OncoKB™ not mapping to any existing tier. Our predefined modifications improved alignment with OncoKB™ Level 1 designations and with published ESMO clinical practice guidelines. 
The mapped set of standard-of-care biomarkers is provided on the OncoKB™ website, offering a practical resource that harmonizes ESCAT tiers of evidence with a widely adopted levels-of-evidence schema.
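The inter-rater statistic reported in this abstract, Krippendorff's alpha, can be sketched as follows for nominal categories. The abstract does not state which distance metric the authors used (ESCAT tiers could also be treated as ordinal), so the nominal metric, the function name, and the toy ratings below are assumptions for illustration.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal labels.

    units: iterable of per-item rating tuples, one entry per rater,
           with None marking a missing rating.
    """
    coincidences = Counter()
    for ratings in units:
        values = [r for r in ratings if r is not None]
        m = len(values)
        if m < 2:
            continue  # items rated by fewer than two raters carry no pairs
        for a, b in permutations(range(m), 2):
            coincidences[(values[a], values[b])] += 1.0 / (m - 1)

    totals = Counter()
    for (c, _k), weight in coincidences.items():
        totals[c] += weight
    n = sum(totals.values())
    if n <= 1:
        return float("nan")

    # Observed and expected disagreement, both scaled by n so the ratio is D_o / D_e.
    observed = sum(w for (c, k), w in coincidences.items() if c != k)
    expected = sum(
        totals[c] * totals[k] for c in totals for k in totals if c != k
    ) / (n - 1)
    if expected == 0:
        return 1.0  # only one category was ever used
    return 1.0 - observed / expected


# Hypothetical toy ratings: two raters assigning tiers to four biomarkers.
toy_ratings = [("1", "1"), ("1", "2"), ("2", "2"), ("2", "2")]

if __name__ == "__main__":
    print(krippendorff_alpha_nominal(toy_ratings))
```

Alpha equals 1 for perfect agreement and 0 when agreement is no better than chance, which gives context for the reported value of 0.586.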